Analysis of a bias effect in a tree-based variable importance measure
Abstract
Research in data mining has widely addressed the problem of variable selection, and several variable importance measures have been proposed in the literature. This paper deals with a frequently used variable importance measure defined in the context of decision trees and tree-based ensemble models such as Random Forests and TreeBoost. The aim of the paper is to show that this importance measure is affected by a bias and to discuss the potentially dangerous consequences of that bias for variable selection. In addition, a heuristic correction strategy is proposed.
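The abstract does not spell out the form of the bias. As a rough illustration only, the sketch below assumes one well-documented form: impurity-based importance tends to reward uninformative variables that offer many candidate split points. It uses the Gini index as the heterogeneity measure and compares the best single-split reduction obtained from two pure-noise predictors, one binary and one with 20 levels. The function names, sample sizes, and simulation design are hypothetical choices made for this example and are not taken from the paper.

```python
import random


def gini(labels):
    """Gini heterogeneity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_gini_reduction(x, y):
    """Largest Gini reduction over all threshold splits of the form x <= t."""
    n = len(y)
    parent = gini(y)
    best = 0.0
    # The largest value is skipped: it would leave the right child empty.
    for t in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        children = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - children)
    return best


def mean_noise_reduction(n_levels, n_samples=200, n_trials=200, seed=0):
    """Average best-split reduction for a predictor drawn independently of y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        y = [rng.randint(0, 1) for _ in range(n_samples)]
        x = [rng.randrange(n_levels) for _ in range(n_samples)]  # pure noise
        total += best_gini_reduction(x, y)
    return total / n_trials


# Both predictors carry no information about y (illustrative assumption).
binary_noise = mean_noise_reduction(n_levels=2)
many_valued_noise = mean_noise_reduction(n_levels=20)
print("binary noise predictor:     ", binary_noise)
print("20-level noise predictor:   ", many_valued_noise)
```

Under these assumptions, neither predictor is related to the response, yet the 20-level one obtains a larger average reduction simply because it has more thresholds to choose from. A selection procedure that ranks variables by this quantity could therefore retain noise variables.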
Similar resources
Meta-analysis (systematic review) of profit management antecedents and explaining the effect of company size adjuster
The purpose of the present study is to conduct a meta-analysis (systematic review) of the antecedents of profit management and to explain the moderating effect of company size. The statistical population consists of 100 articles and dissertations published during the years 1387 to 1398. Based on the research method, 48 studies were reviewed as the final sample. The present study was done by meta-analysis usin...
Investigation of the Allometric Models in Estimation of Poplar (Populus deltoides) Height
One of the most important issues in forest biometrics is the use of allometric functions to estimate the tree height by using diameter-height models. Measuring the total height of trees is usually a complex and time-consuming process. In allometric functions, the diameter is measured directly but the height of the tree is an estimate of an allometric model, which will be more accurate if the cr...
Modeling employee performance appraisal using expert systems
The aim of this study is to develop an employee performance appraisal model via expert systems. Due to the importance and the value of human resources in organizations, a capable work environment is not recognized unless it considers HR as the main drive. By the same token, to utilize the HR efficiently, a performance appraisal system is needed in which practical precision and simplicity is of...
A bias correction algorithm for the Gini variable importance measure in classification trees
This paper considers a measure of variable importance frequently used in variable selection methods based on decision trees and tree-based ensemble models, like CART, Random Forests and Gradient Boosting Machine. It is defined as the total heterogeneity reduction produced by a given covariate on the response variable when the sample space is recursively partitioned. Some authors showed that thi...
Effect of Bias in Contrast Agent Concentration Measurement on Estimated Pharmacokinetic Parameters in Brain Dynamic Contrast-Enhanced Magnetic Resonance Imaging Studies
Introduction: Pharmacokinetic (PK) modeling of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is widely applied in tumor diagnosis and treatment evaluation. Precision analysis of the estimated PK parameters is essential when they are used as a measure for therapy evaluation or treatment planning. In this study, the accuracy of PK parameters in brain DCE...